Abstract

Differential encoding techniques are fast and easy to implement. 
However, a major problem with the use of differential encoding 
for images is the rapid edge degradation encountered when us- 
ing such systems. This makes differential encoding techniques 
of limited utility especially when coding medical or scientific 
images, where edge preservation is of utmost importance. We 
present a simple, easy to implement differential image coding 
system with excellent edge preservation properties. The coding 
system can be used over variable rate channels which makes it 
especially attractive for use in the packet network environment. 


Introduction 


The transmission and storage of digital images requires an enor- 
mous expenditure of resources, necessitating the use of compres- 
sion techniques. These techniques include relatively low com- 
plexity predictive techniques such as Adaptive Differential Pulse 
Code Modulation (ADPCM) and its variations, as well as rel- 
atively higher complexity techniques such as transform coding 
and vector quantization [1,2]. Most compression schemes were
originally developed for speech and their application to images is 
at times problematic. This is especially true of the low complex- 
ity predictive techniques. A good example of this is the highly 
popular ADPCM scheme. Originally designed for speech [3], it 
has been used with other sources with varying degrees of suc- 
cess. A major problem with its use in image coding is the rapid 
degradation in quality whenever an edge is encountered. Edges 
are perceptually very important and occur quite often in most 
images. Therefore, the degradation of edges can be perceptually
very annoying. If the images under consideration contain medi-
cal or scientific data, the problem becomes even more important,
as edges provide position information which may be crucial to
the viewer. This poor edge reconstruction quality has been a
major factor in preventing ADPCM from becoming as popular 
for image coding as it is for speech coding. 

While good edge reconstruction capability is an important 
requirement for image coding schemes, another requirement that 
is gaining in importance with the proliferation of packet switched 
networks, is the ability to encode the image at different rates. 
In a packet switched network, the available channel capacity is 
not a fixed quantity, but rather fluctuates as a function of the 
load on the network. The compression scheme must therefore be 
capable of operating at different rates as the available capacity 
changes. This means that it should be able to take advantage 
of increased capacity when it becomes available while providing 
graceful degradation when the rate decreases to match decreased 
available capacity. 

In this paper we describe a DPCM based coding scheme 
which has the desired properties listed above. It is a low com- 
plexity scheme with excellent edge preservation in the recon- 
structed image. It takes full advantage of the available channel 
capacity providing lossless compression when sufficient capacity
is available, and very graceful degradation when a reduction in
rate is required. 


The DPCM system consists of two main blocks, the quantizer 
and the predictor (see Fig. 1). The predictor uses the correlation
between samples of the waveform to predict the next sample 
value. This predicted value is removed from the waveform at 
the transmitter and reintroduced at the receiver. The prediction 
error is quantized to one of a finite number of values which is 
coded and transmitted to the receiver. The difference between 
the prediction error and the quantized prediction error is called 
the quantization error or the quantization noise. If the channel 
is error free, the reconstruction error at the receiver is simply the
quantization error. To see this, note (Fig. 1) that the prediction 
error e(k) is given by 

$$ e(k) = s(k) - p(k) \tag{1} $$

where the predicted value is given by

$$ p(k) = \sum_{j} a_j \hat{s}(k-j) \tag{2} $$

and

$$ \hat{s}(k) = e_q(k) + p(k). \tag{3} $$

Assuming an additive noise model, the quantized prediction
error $e_q(k)$ can be represented as

$$ e_q(k) = e(k) + n_q(k) \tag{4} $$

where $n_q(k)$ denotes the quantization noise. The quantized pre-
diction error is coded and transmitted to the receiver. If the
channel is noisy this is received as $\hat{e}_q(k)$ which is given by

$$ \hat{e}_q(k) = e_q(k) + n_c(k) \tag{5} $$

where $n_c(k)$ represents the channel noise. The output of the
receiver $\tilde{s}(k)$ is thus given by

$$ \tilde{s}(k) = \tilde{p}(k) + \hat{e}_q(k) \tag{6} $$

where

$$ \tilde{p}(k) = p(k) + n_p(k), \tag{7} $$

the additional term $n_p(k)$ being the result of the introduction
of channel noise into the prediction process. Using (1), (4), (5),
and (7) in (6) we obtain

$$ \tilde{s}(k) = s(k) + n_q(k) + n_c(k) + n_p(k). \tag{8} $$


If the channel is error free, the last two terms in (8) drop out 
and the difference between the original and reconstructed signal 
is simply the quantization error. 
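
As an illustration of this result, the following minimal sketch implements the DPCM loop of equations (1) to (3) for an error free channel. It assumes a one tap predictor with a hypothetical coefficient `a`, a uniform quantizer with an unbounded number of levels, and an initial prediction of zero; the function names and these choices are illustrative assumptions and are not taken from the text.

```python
import math

def quantize(e, step):
    """Uniform quantizer with an unbounded range: nearest multiple of step."""
    return step * math.floor(e / step + 0.5)

def dpcm_encode(samples, step, a=1.0):
    """Return the quantized prediction errors e_q(k) for a one tap predictor."""
    e_q_seq = []
    s_hat_prev = 0.0                 # assumed initial reconstruction
    for s in samples:
        p = a * s_hat_prev           # prediction from the previous reconstruction
        e_q = quantize(s - p, step)  # quantized prediction error
        e_q_seq.append(e_q)
        s_hat_prev = p + e_q         # reconstruction tracked at the transmitter
    return e_q_seq

def dpcm_decode(e_q_seq, a=1.0):
    """Rebuild the signal from the quantized prediction errors."""
    out = []
    s_prev = 0.0
    for e_q in e_q_seq:
        s_prev = a * s_prev + e_q
        out.append(s_prev)
    return out

samples = [100, 102, 101, 180, 182, 60]
step = 4
recon = dpcm_decode(dpcm_encode(samples, step))
errors = [r - s for r, s in zip(recon, samples)]
print(errors)
```

Because the decoder repeats the encoder's prediction exactly, each entry of `errors` is just the quantization error of that sample and cannot exceed half the stepsize in magnitude.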

When the prediction error is small, it falls into one of the 
inner levels of the quantizer, and the quantization noise is of a 
type referred to as granular noise. If the prediction error falls in
one of the outer levels of the quantizer, the incurred quantization
error is called overload noise. Because of the way the granular 
noise is generated it is generally smaller in magnitude than the 
overload noise and is bounded by the size of the quantization 
interval. The overload noise on the other hand is essentially 
unbounded and can become very large depending on the size of 
the prediction error. As edge pixels are rather difficult to predict, 
the corresponding prediction error is generally large, and this 
leads to large overload noise values. Furthermore, because this 
error affects not only the reconstruction of the current pixel,
but also future predictions, the prediction errors corresponding
to the next few pixels also tend to be large, leading to an edge
“smearing” effect.
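
The smearing effect can be reproduced with a small numerical sketch. It assumes a one tap predictor with unit coefficient, a quantizer whose output is clamped to a hypothetical outermost level `max_level`, and a first sample that is sent uncoded; the pixel values and parameters are made up for illustration.

```python
import math

def quantize_limited(e, step, max_level):
    """Uniform quantizer whose output is clamped to [-max_level, max_level]."""
    q = step * math.floor(e / step + 0.5)
    return max(-max_level, min(max_level, q))

def dpcm_reconstruct(samples, step, max_level):
    """One tap DPCM (coefficient 1); the first sample is assumed sent uncoded."""
    recon = [samples[0]]
    for s in samples[1:]:
        p = recon[-1]                                   # prediction
        e_q = quantize_limited(s - p, step, max_level)  # may overload
        recon.append(p + e_q)
    return recon

# A row of pixels containing a sharp edge from 50 to 200.
row = [50, 50, 50, 200, 200, 200, 200, 200, 200, 200]
recon = dpcm_reconstruct(row, step=8, max_level=24)
errors = [r - s for r, s in zip(recon, row)]
print(recon)   # [50, 50, 50, 74, 98, 122, 146, 170, 194, 202]
print(errors)
```

The quantizer overloads at the edge and the reconstruction needs six further pixels to climb to the new level, which is the smearing described above; once the prediction error falls back inside the quantizer range the error returns to granular size.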

Reduction of the edge degradation can therefore be obtained 
by reducing or eliminating the slope overload noise. Reduc- 
tion of the slope overload noise can be obtained by improving 
the prediction process. Gibson [4] analyzed ADPCM systems 
with backward adaptive prediction, and showed that the track- 
ing ability of the adaptive predictor can be improved by the 
addition of zeros in the predictor. Motivated by these results, 
Sayood and Schekall [5] designed ADPCM systems for image 
coding with ARMA predictors. Their results show that some 
reduction in the edge degradation is possible with the use of 
adaptive zeros in the predictor. While the use of these predic- 
tors improves the edge reconstruction there is still significant 
degradation in the edges. One technique to further improve the 
edge performance was developed by Schekall and Sayood [6], 
which uses the Jayant quantizer as an edge detector. The over- 
load noise is then reduced by sending a quantized representation
of the noise through a side channel. The advantage of this ap-
proach is that it can be added to existing ADPCM systems.
The disadvantage is that the use of a side channel introduces 
synchronization problems. In this paper we propose a different 
approach for edge preservation which does not require a side 
channel. This approach is described in the following section. 

Proposed Approach 

The approach taken in this paper is a variation on the standard 
rate-distortion tradeoff. The basic idea is that the slope over- 
load noise can be reduced by increasing the rate. However, rather
than increasing the rate for encoding each and every pixel, there
is only an instantaneous rate increase whenever slope overload
is encountered. The way this is implemented is outlined in the
block diagram of Figure 2. A DPCM system is followed by a 
lossless encoder at the transmitter. At the receiver the inverse 
operations are performed. The DPCM system differs from stan- 
dard DPCM systems in that the quantizer being used has an 
unlimited number of levels. In practice what this means is that 
if the input has 256 levels, which is standard for monochrome 
images, then the DPCM quantizer will have 512 levels. This 
effectively eliminates the overload noise, making the distortion
a function of the quantizer stepsize $\Delta$. Of course, by itself it
also eliminates any compression that may have been desired; in
fact it requires an increase of one bit in the rate. The com-
pression is obtained by use of the lossless encoder. The lossless
encoder output alphabet consists of $N$ codewords. These code-
words correspond to $N$ consecutive levels in the quantizer. Let
the smallest level be labeled $x_L$ and the largest level be labeled
$x_H$. If the quantizer output $e_q(k)$ is a level between $x_L$ and
$x_H$, then the lossless encoder puts out the corresponding chan-
nel symbol. If, however, $e_q(k)$ is greater than $x_H$, the encoder
puts out the symbol corresponding to $x_H$. A new value $e_{q1}(k)$
is then obtained by subtracting $x_H$ from $e_q(k)$. If this value is
less than $x_H$ then it is encoded using the corresponding code-
word in the lossless encoder output alphabet. Otherwise, $x_H$ is
again subtracted from $e_{q1}(k)$ to generate $e_{q2}(k)$. This process is
continued till some $e_{qn}(k)$ where

$$ e_{qn}(k) = e_q(k) - n x_H $$

and $e_{qn}(k)$ is less than $x_H$. A similar strategy is followed when
$e_q(k) \le x_L$. Thus the instantaneous rate is increased by a func-
tion of $n$ whenever the prediction error falls outside the closed
interval $[x_L, x_H]$.

Example: Consider a DPCM system with a stepsize $\Delta$ of 2
where the input output relationship is given by

$$ Q[x] = 2k \quad \text{if } 2k - 1 \le x < 2k + 1; \quad k = 0, \pm 1, \pm 2, \ldots $$

Let the lossless encoder output alphabet be of size eight with
$x_L = -4$ and $x_H = 10$. If the input $e(k)$ is 7 the quantizer out-
put $e_q(k)$ is 8, which is in the lossless encoder output alphabet
and therefore this value is encoded as a single codeword. If $e(k)$
is 15 then $e_q(k)$ is 16, which is larger than $x_H$. In this case,
the encoder puts out the codeword corresponding to $x_H$ and
generates $e_{q1}(k) = 16 - 10 = 6$, which is in the encoder output
alphabet. Therefore, the encoder output consists of two code-
words representing $x_H$ (10) and 6. If the input is $-7$, $e_q(k) = -6$,
which is less than $x_L$. Thus the lossless encoder output consists
of two symbols, one corresponding to the value of $x_L$ ($-4$) and
one corresponding to the value of $-2$. Note that if the input is
10 or $-4$ (i.e. $x_H$ or $x_L$) then the output will be the sequence
10, 0 or $-4$, 0.
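
The example can be checked with a short sketch of the encoder and its inverse. The function names are hypothetical, and the boundary handling (a value equal to $x_H$ or $x_L$ is sent as that symbol followed by 0) follows the note at the end of the example; the sketch assumes $x_L < 0 < x_H$.

```python
import math

def quantize(e, step=2):
    """Q[x] = step*k for step*k - step/2 <= x < step*k + step/2."""
    return step * math.floor((e + step / 2) / step)

def escape_encode(e_q, x_low, x_high):
    """Split one quantizer output into symbols lying in [x_low, x_high]."""
    symbols = []
    while e_q >= x_high:          # overload on the positive side
        symbols.append(x_high)
        e_q -= x_high
    while e_q <= x_low:           # overload on the negative side
        symbols.append(x_low)
        e_q -= x_low
    symbols.append(e_q)           # terminal symbol, strictly inside the range
    return symbols

def escape_decode(symbols, x_low, x_high):
    """Sum symbols until a terminal (interior) symbol closes each value."""
    values, acc = [], 0
    for sym in symbols:
        acc += sym
        if x_low < sym < x_high:
            values.append(acc)
            acc = 0
    return values

for e in (7, 15, -7, 10, -4):
    print(e, escape_encode(quantize(e), -4, 10))
```

The decoder needs no side information: any symbol strictly between $x_L$ and $x_H$ ends the current value, so the sum of the symbols received up to that point is the quantizer output.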

One of the consequences of this type of encoding is that it
can generate runs of $x_L$ and $x_H$ whenever the image contains
a large number of edges. Fortunately the encoding scheme also
provides a significant number of special symbols that can be used
to encode these runs. For example, the sequence $x_H$ followed by
a negative value and the sequence $x_L$ followed by a positive value
would not occur in the normal course of events. These sequences
can therefore be used to encode the runlengths of $x_L$ and $x_H$.
Consider for example a system in which $\Delta$ is 2 and $x_L$ is $-4$.
The output of the lossless encoder therefore corresponds to the
values -4, -2, 0, 2, 4. In the standard system a value of 4 is
always followed by a value of 0, 2 or another 4. Similarly a value
of -4 is always followed by a value of 0, -2 or another -4.
Therefore, the sequences 4 -2 and -4 2 can be used as special
symbols to denote runs of 4 or -4. A simple strategy is to replace
every two 4's (or -4's) after the initial 4 (or -4) by a -2 (or 2).
For example a value of 10 would still be represented by 4 4 2.
However a value of 14 would be represented by 4 -2 2 instead of
4 4 4 2. Similarly a value of 18 would be represented by 4 -2 4 2
and a value of 22 would be represented by 4 -2 -2 2. For this
particular scheme, a run of length $n$ would be represented by
$n - \lfloor (n-1)/2 \rfloor$ codewords. When
the size of the lossless encoder output is increased, the number
of special symbols available also increases and the coding of the
runs can be performed more efficiently.
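
A minimal sketch of this run substitution, for the five level alphabet -4, -2, 0, 2, 4, is given below. The function names are hypothetical and the sketch operates on the symbols of one quantizer output at a time; it illustrates the simple strategy described above and is not a complete runlength coder.

```python
def escape_encode(e_q, x_low, x_high):
    """Split one quantizer output into symbols lying in [x_low, x_high]."""
    symbols = []
    while e_q >= x_high:
        symbols.append(x_high)
        e_q -= x_high
    while e_q <= x_low:
        symbols.append(x_low)
        e_q -= x_low
    symbols.append(e_q)
    return symbols

def compress_run(symbols, x_low, x_high, step):
    """Replace every two outer symbols after the first by a special marker."""
    *run, last = symbols
    if len(run) < 2:
        return list(symbols)
    edge = run[0]
    marker = -step if edge == x_high else step   # never legal after `edge`
    extra = len(run) - 1                         # outer symbols after the first
    return [edge] + [marker] * (extra // 2) + [edge] * (extra % 2) + [last]

def expand_run(symbols, x_low, x_high, step):
    """Inverse of compress_run."""
    out = [symbols[0]]
    for sym in symbols[1:]:
        prev = out[-1]
        if prev == x_high and sym == -step:
            out += [x_high, x_high]
        elif prev == x_low and sym == step:
            out += [x_low, x_low]
        else:
            out.append(sym)
    return out

for value in (10, 14, 18, 22):
    print(value, compress_run(escape_encode(value, -4, 4), -4, 4, 2))
```

The four printed sequences are 4 4 2, 4 -2 2, 4 -2 4 2 and 4 -2 -2 2, matching the values quoted in the text.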

These special sequences can also be used to signal a change 
of rate for applications in which the available channel capacity 
changes with time. The actual change can be accommodated by
changing the stepsize and reducing the lossless encoder codebook 
size by the same amount. Several of the systems proposed above 
were simulated. The results of these simulations are presented 
in the next section. 


Results 


Four systems of the type described in the previous section have 
been simulated. Two of the systems simulated use a one tap
fixed predictor, while the other two use a one pole four zero 
predictor with the zeros being adaptive. One of the systems in 
each case contains the lossless encoder followed by a runlength 
encoder while the other contains only the lossless encoder with- 
out the runlength encoder. The test images used were the USC 
GIRL image, and the USC COUPLE image. Both are 256 X 
256 monochrome eight bit images and have been used often as 
test images. The objective performance measures were the Peak
Signal to Noise Ratio (PSNR) and the Mean Absolute Error
(MAE) which are defined as follows:

$$ \mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\langle (s(k) - \hat{s}(k))^2 \rangle} $$

$$ \mathrm{MAE} = \langle |s(k) - \hat{s}(k)| \rangle $$

where $\langle \cdot \rangle$ denotes the average value.
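
For reference, the two measures can be computed as in the sketch below, which treats an image as a flat list of eight bit pixel values. The function names and the sample data are illustrative assumptions.

```python
import math

def psnr(original, reconstructed):
    """Peak signal to noise ratio in dB for eight bit data (peak value 255)."""
    mse = sum((s - r) ** 2 for s, r in zip(original, reconstructed)) / len(original)
    return math.inf if mse == 0 else 10 * math.log10(255 ** 2 / mse)

def mae(original, reconstructed):
    """Mean absolute error between original and reconstructed pixels."""
    return sum(abs(s - r) for s, r in zip(original, reconstructed)) / len(original)

original = [10, 20, 30, 40]
reconstructed = [11, 19, 30, 42]
print(psnr(original, reconstructed), mae(original, reconstructed))
```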

Several initial test runs were performed using different num-
bers of levels, different values of $x_L$ and different values of $\Delta$
to get a feel for the optimum values of the various parameters
(given $x_L$ and $\Delta$, $x_H$ is automatically determined). We found
that an appropriate way of selecting the value of $x_L$ was using
the relationship


XL = 



where $\lfloor x \rfloor$ is the largest integer less than or equal to $x$, and $N$
is the size of the alphabet of the lossless coder. This provides
a symmetric codebook when the alphabet size is odd, and a
codebook skewed to the positive side when the alphabet size is
even. The zero value is always in the codebook.
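
A codebook with these properties can be sketched as follows. The expression used here for the smallest level, $-\lfloor (N-1)/2 \rfloor \Delta$, is an assumption chosen only because it satisfies the properties just listed (symmetric for odd $N$, skewed to the positive side for even $N$, zero always included); it should not be read as the relationship used by the authors. It does not, for instance, reproduce the earlier worked example, which pairs an alphabet of size eight with $x_L = -4$.

```python
def codebook_levels(n, step):
    """Return n consecutive quantizer levels under the assumed rule for x_L."""
    x_low = -((n - 1) // 2) * step      # assumed rule, see the note above
    return [x_low + i * step for i in range(n)]

print(codebook_levels(5, 2))   # [-4, -2, 0, 2, 4]
print(codebook_levels(8, 2))   # [-6, -4, -2, 0, 2, 4, 6, 8]
```

The largest level follows from the smallest one as $x_H = x_L + (N-1)\Delta$, which is why only $x_L$ and $\Delta$ need to be chosen once $N$ is fixed.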

As the alphabet size is usually not a power of two, the binary
code for the output alphabet will be a variable length code. The
use of variable length codes always brings up issues of robustness
with respect to changing input statistics. With this in mind,
the rate was calculated in two different ways. The first was to
find the output entropy, and scale it up by the ratio of symbols
transmitted to the number of pixels encoded. We call this rate
the entropy rate, which is the minimum rate obtainable if we
assume the output of the lossless encoder to be memoryless.
While this assumption is not necessarily true, the entropy rate
gives us an idea about the best we can do with a particular
system. We will treat it as the lower bound on the obtainable
rate. We also calculated the rate using a predetermined variable
length code. This code was designed with no prior knowledge
of the probabilities of the different letters. The only assumption
was that the letters representing the inner levels of the quantizer
were always more likely than the letters representing the outer
levels of the quantizer. The code tree used is shown in Figure 3.
Obviously, this will become highly inefficient in the case of small
alphabet size and small $\Delta$, as in this case the outer levels $x_L$
and $x_H$ will occur quite frequently. This rate can be viewed as
an upper bound on the achievable rate.
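
The entropy rate calculation can be sketched as below: the first order entropy of the lossless encoder output, in bits per symbol, is scaled by the number of symbols sent per pixel. The function name and the sample symbol stream are hypothetical.

```python
import math
from collections import Counter

def entropy_rate(symbols, num_pixels):
    """First order entropy of the symbol stream, scaled to bits per pixel."""
    total = len(symbols)
    counts = Counter(symbols)
    entropy = -sum(c / total * math.log2(c / total) for c in counts.values())
    return entropy * total / num_pixels

# Four pixels coded with the alphabet {-2, 0, 2}: each overloads once,
# so eight symbols are sent in all.
symbols = [2, 0, 2, 0, -2, 0, -2, 0]
print(entropy_rate(symbols, num_pixels=4))   # 3.0 bits per pixel
```

In this made up stream the entropy is 1.5 bits per symbol, but two symbols are sent for every pixel, so the rate per pixel doubles to 3.0 bits.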

The results for the system with a one tap predictor and with- 
out the runlength encoder are shown in Tables 1 and 2. Table 1 
contains the results for the COUPLE image, while Table 2 con-
tains the results for the GIRL image. In the tables $R_l$ denotes
the entropy rate while $R_u$ is the rate obtained using the Huffman
code of Figure 3. Recall that for image compression schemes, 
systems with PSNR values of greater than 35 dB are percep- 
tually almost identical. As can be seen from the PSNR values 
in the tables there is very little degradation with rate, and in 
fact if we use the 35 dB criterion there is almost no degrada- 
tion in image quality until the rate drops below two bits per 
pixel. This can be verified by the reconstructed images shown 
in Figure 4. Each picture in Figure 4 consists of the original 
image, the reconstructed image and the error image magnified 
10 fold. In each of the pictures, it is extremely difficult to tell 
the source or original image from the reconstructed or output 
image. In fact, in the case of the image coded at rates above 
two bits per pixel it is well nigh impossible. This subjective ob- 
servation is supported by the error images in each case which 
are uniform in texture throughout without any of the standard 
edge artifacts which can be usually seen in the error images for 
most compression schemes. 

We can see from the results that if the value of $\Delta$ and hence
$x_L$ is fixed, the size of the codebook has no effect on the perfor-
mance measures. This is because the only effect of reducing the
codebook size under these conditions is to increase the number
of symbols transmitted. While this has the effect of increasing
the rate, because of the way the system is constructed, it does
not influence the resulting distortion. The drop in rate for the
same distortion as the alphabet size increases can be clearly seen
from the results in Tables 1 and 2.

Table 3 shows the decrease in rate when a simple runlength
coder is used. The runlength coder encodes long strings of $x_L$
and $x_H$ using the special sequences mentioned previously. As
can be seen from the results the improvement provided by the
current runlength encoding scheme is significant only for small
alphabets and small values of $\Delta$. This is because it is under
these conditions that most of the long strings of $x_L$ and $x_H$ are
generated. However we are not as yet using many of the special
sequences in the larger alphabet codebooks, so there is certainly
room for improvement.

The one tap predictor was replaced with an adaptive ARMA 
predictor with a fixed pole and four adaptive zeros. The fixed 
pole was at a lag of 257 (pixel above) while the zeros were at 
lags of one, two, three and four. The adaption was performed 
using a sample LMS algorithm as follows. Let $B_k$ be the vector
of predictor coefficients at time $k$. The adaption algorithm was

$$ B_{k+1} = B_k + \mu\, e_q(k)\, E_k $$

where $\mu$ is the adaption stepsize and

$$ E_k = (e_q(k-1),\ e_q(k-2),\ e_q(k-3),\ e_q(k-4))^T. $$
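
A minimal sketch of one adaption step for the four zero coefficients is given below. It assumes the standard unnormalized LMS form, in which each coefficient is moved in proportion to the product of the current quantized prediction error and the corresponding past quantized prediction error; the function names, the stepsize value and the sample data are hypothetical, and any normalization the authors may have used is not shown.

```python
def zero_section_output(coeffs, e_q_past):
    """Contribution of the adaptive zeros to the prediction."""
    return sum(b * e for b, e in zip(coeffs, e_q_past))

def lms_update(coeffs, e_q_now, e_q_past, mu):
    """One LMS step: move each coefficient along e_q(k) * e_q(k - i)."""
    return [b + mu * e_q_now * e for b, e in zip(coeffs, e_q_past)]

coeffs = [0.0, 0.0, 0.0, 0.0]     # zeros at lags one to four
e_q_past = [1, -1, 0, 3]          # e_q(k-1), ..., e_q(k-4)
coeffs = lms_update(coeffs, e_q_now=2, e_q_past=e_q_past, mu=0.01)
print(coeffs)
print(zero_section_output(coeffs, e_q_past))
```

Since the update uses only quantized prediction errors, which are available at both ends of the channel, the receiver can repeat the same adaption without any side information.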

The results from using this predictor are shown in Tables 4, 5 
and 6. While there is some improvement in all cases, the results 
for the COUPLE image show a greater improvement than the
results for the GIRL image. This can be explained by noting
that the COUPLE image contains many more edges than the
GIRL image. As the ARMA predictor tends to improve predic-
tor performance when edges are encountered, the improvement
in performance occurs in the image with more edges. 

Conclusion

We have demonstrated a simple image coding scheme which is
very easy to implement in realtime and has excellent edge preser-
vation properties over a wide range of rates.

This system would be especially useful in transmitting im-
ages over channels where the available bandwidth may vary.
The edge preserving quality is especially useful in the encoding
of scientific and medical images.

Table 1: Performance results for the COUPLE image, alphabet size 3, 5 and 8.

Table 2: Performance results for the GIRL image, alphabet size 3, 5 and 8.

Table 4: Performance results for COUPLE image with adaptive ARMA predictor.

Table 5: Performance results for GIRL image with adaptive ARMA predictor.

Table 6: Comparison of entropy rates between systems with and without
the runlength encoder for the COUPLE image.



Figure 4(a). GIRL image coded at entropy rate of 1.7 bpp.



Figure 4(b). GIRL image coded at entropy rate of 1.5 bpp. 
Appendix 2 - Item 7



